如何从每个轨迹数据中提取尽可能多的学习信号是强化学习(RL)中的关键问题,其中样本效率低下对实际应用构成了严重挑战。最近的作品表明,使用表现力的政策函数近似器和对未来轨迹信息的调理 - 例如在决策变压器(DT)中重播或退回的未来状态 - 可以高效地学习多任务策略,在哪里有时在线RL被离线行为克隆完全替换,例如序列建模。我们展示所有这些方法都正在进行后视信息匹配(他) - 培训策略,可以输出与未来状态信息的一些统计数据匹配的轨迹的其余轨迹。我们呈现出用于解决任何问题的广义决策变压器(GDT),并显示特征功能的选择和抗因果聚合器的不同选择性不仅恢复DT为特殊情况,而且还导致新的分类DT(CDT)和BI - 用于匹配未来不同统计数据的DT(BDT)。为了评估CDT和BDT,我们将离线多任务状态边缘匹配(SMM)和仿制学习(IL)定义为两个普遍的他问题,提出了Wasserstein距离损失作为两者的度量,并对Mujoco连续控制进行了经验研究它们基准。 CDT简单地取代了DT中的反因果衬合的反因果求和,使得第一种有效的离线多任务SMM算法概括为看不见甚至合成的多模态状态特征分布。使用反因果第二变压器作为聚合器的BDT可以学习模拟未来的任何统计数据,并在离线多任务IL中占DT变体。我们的广义配方来自他和GDT大大扩大了强大的序列建模架构在现代RL中的作用。
translated by 谷歌翻译
在本研究中,我们提出了一种基于病例的新型图像检索(SIR)方法,用于苏木精和曙红(H&E)染色的恶性淋巴瘤的组织病理学图像。当将整个幻灯片图像(WSI)用作输入查询时,希望能够通过重点关注病理上重要区域(例如肿瘤细胞)中的图像斑块来检索相似情况。为了解决这个问题,我们采用了基于注意力的多个实例学习,这使我们能够在计算案例之间的相似性时专注于肿瘤特异性区域。此外,我们采用对比度距离度量学习将免疫组织化学(IHC)染色模式纳入有用的监督信息,以定义异质性恶性淋巴瘤病例之间的适当相似性。在对249例恶性淋巴瘤患者的实验中,我们证实该方法比基线基于病例的SIR方法表现出更高的评估措施。此外,病理学家的主观评估表明,我们使用IHC染色模式的相似性度量适用于代表恶性淋巴瘤H&E染色组织图像的相似性。
translated by 谷歌翻译
A practical issue of edge AI systems is that data distributions of trained dataset and deployed environment may differ due to noise and environmental changes over time. Such a phenomenon is known as a concept drift, and this gap degrades the performance of edge AI systems and may introduce system failures. To address this gap, a retraining of neural network models triggered by concept drift detection is a practical approach. However, since available compute resources are strictly limited in edge devices, in this paper we propose a lightweight concept drift detection method in cooperation with a recently proposed on-device learning technique of neural networks. In this case, both the neural network retraining and the proposed concept drift detection are done by sequential computation only to reduce computation cost and memory utilization. Evaluation results of the proposed approach shows that while the accuracy is decreased by 3.8%-4.3% compared to existing batch-based detection methods, it decreases the memory size by 88.9%-96.4% and the execution time by 1.3%-83.8%. As a result, the combination of the neural network retraining and the proposed concept drift detection method is demonstrated on Raspberry Pi Pico that has 264kB memory.
translated by 谷歌翻译
Owing to the widespread adoption of the Internet of Things, a vast amount of sensor information is being acquired in real time. Accordingly, the communication cost of data from edge devices is increasing. Compressed sensing (CS), a data compression method that can be used on edge devices, has been attracting attention as a method to reduce communication costs. In CS, estimating the appropriate compression ratio is important. There is a method to adaptively estimate the compression ratio for the acquired data using reinforcement learning. However, the computational costs associated with existing reinforcement learning methods that can be utilized on edges are expensive. In this study, we developed an efficient reinforcement learning method for edge devices, referred to as the actor--critic online sequential extreme learning machine (AC-OSELM), and a system to compress data by estimating an appropriate compression ratio on the edge using AC-OSELM. The performance of the proposed method in estimating the compression ratio is evaluated by comparing it with other reinforcement learning methods for edge devices. The experimental results show that AC-OSELM achieved the same or better compression performance and faster compression ratio estimation than the existing methods.
translated by 谷歌翻译
IR models using a pretrained language model significantly outperform lexical approaches like BM25. In particular, SPLADE, which encodes texts to sparse vectors, is an effective model for practical use because it shows robustness to out-of-domain datasets. However, SPLADE still struggles with exact matching of low-frequency words in training data. In addition, domain shifts in vocabulary and word frequencies deteriorate the IR performance of SPLADE. Because supervision data are scarce in the target domain, addressing the domain shifts without supervision data is necessary. This paper proposes an unsupervised domain adaptation method by filling vocabulary and word-frequency gaps. First, we expand a vocabulary and execute continual pretraining with a masked language model on a corpus of the target domain. Then, we multiply SPLADE-encoded sparse vectors by inverse document frequency weights to consider the importance of documents with lowfrequency words. We conducted experiments using our method on datasets with a large vocabulary gap from a source domain. We show that our method outperforms the present stateof-the-art domain adaptation method. In addition, our method achieves state-of-the-art results, combined with BM25.
translated by 谷歌翻译
To ensure the safety of railroad operations, it is important to monitor and forecast track geometry irregularities. A higher safety requires forecasting with a higher spatiotemporal frequency. For forecasting with a high spatiotemporal frequency, it is necessary to capture spatial correlations. Additionally, track geometry irregularities are influenced by multiple exogenous factors. In this study, we propose a method to forecast one type of track geometry irregularity, vertical alignment, by incorporating spatial and exogenous factor calculations. The proposed method embeds exogenous factors and captures spatiotemporal correlations using a convolutional long short-term memory (ConvLSTM). In the experiment, we compared the proposed method with other methods in terms of the forecasting performance. Additionally, we conducted an ablation study on exogenous factors to examine their contribution to the forecasting performance. The results reveal that spatial calculations and maintenance record data improve the forecasting of the vertical alignment.
translated by 谷歌翻译
Embodied Instruction Following (EIF) studies how mobile manipulator robots should be controlled to accomplish long-horizon tasks specified by natural language instructions. While most research on EIF are conducted in simulators, the ultimate goal of the field is to deploy the agents in real life. As such, it is important to minimize the data cost required for training an agent, to help the transition from sim to real. However, many studies only focus on the performance and overlook the data cost -- modules that require separate training on extra data are often introduced without a consideration on deployability. In this work, we propose FILM++ which extends the existing work FILM with modifications that do not require extra data. While all data-driven modules are kept constant, FILM++ more than doubles FILM's performance. Furthermore, we propose Prompter, which replaces FILM++'s semantic search module with language model prompting. Unlike FILM++'s implementation that requires training on extra sets of data, no training is needed for our prompting based implementation while achieving better or at least comparable performance. Prompter achieves 42.64% and 45.72% on the ALFRED benchmark with high-level instructions only and with step-by-step instructions, respectively, outperforming the previous state of the art by 6.57% and 10.31%.
translated by 谷歌翻译
长尾数据集(Head Class)组成的培训样本比尾巴类别多得多,这会导致识别模型对头等舱有偏见。加权损失是缓解此问题的最受欢迎的方法之一,最近的一项工作表明,班级难度可能比常规使用的类频率更好地决定了权重的分布。在先前的工作中使用了一种启发式公式来量化难度,但是我们从经验上发现,最佳公式取决于数据集的特征。因此,我们提出了困难网络,该难题学习在元学习框架中使用模型的性能来预测类的难度。为了使其在其他班级的背景下学习班级的合理难度,我们新介绍了两个关键概念,即相对难度和驾驶员损失。前者有助于困难网络在计算班级难度时考虑其他课程,而后者对于将学习指向有意义的方向是必不可少的。对流行的长尾数据集进行了广泛的实验证明了该方法的有效性,并且在多个长尾数据集上实现了最先进的性能。
translated by 谷歌翻译
联合学习是一种机器学习方法,其中未在服务器上汇总数据,而是根据安全性和隐私性分配给边缘。 Resnet是一个经典但代表性的神经网络,通过学习将输入和输出加在一起的残留功能,成功地加深了神经网络。在联合学习中,服务器和边缘设备之间执行交流以交换权重参数,但是Resnet具有深层和大量参数,因此通信大小变得很大。在本文中,我们将神经颂歌用作重新设计的轻量级模型,以减少联合学习中的沟通规模。此外,我们使用具有不同数量的迭代的神经ODE模型新引入了灵活的联合学习,这与具有不同深度的重新连接相对应。 CIFAR-10数据集用于评估中,与RESNET相比,神经ODE的使用将通信大小降低了约90%。我们还表明,提出的灵活联合学习可以与不同的迭代计数合并模型。
translated by 谷歌翻译
本文介绍了素描的现实,这种方法结合了AR素描和驱动的有形用户界面(TUI),用于双向素描交互。双向草图使虚拟草图和物理对象通过物理驱动和数字计算相互影响。在现有的AR素描中,虚拟世界和物理世界之间的关系只是一个方向 - 虽然物理互动会影响虚拟草图,但虚拟草图对物理对象或环境没有返回效果。相反,双向素描相互作用允许草图和驱动的tuis之间的无缝耦合。在本文中,我们采用桌面大小的小型机器人(Sony Toio)和基于iPad的AR素描工具来演示该概念。在我们的系统中,在iPad上绘制和模拟的虚拟草图(例如,线,墙壁,摆和弹簧)可以移动,动画,碰撞和约束物理Toio机器人,就像虚拟草图和物理对象存在于同一空间中一样通过AR和机器人运动之间的无缝耦合。本文贡献了一组新型的互动和双向AR素描的设计空间。我们展示了一系列潜在的应用,例如有形的物理教育,可探索的机制,儿童有形游戏以及通过素描的原位机器人编程。
translated by 谷歌翻译